effective data
Hierarchical Federated Learning for Social Network with Mobility
Chen, Zeyu, Chen, Wen, Li, Jun, Wu, Qingqing, Ding, Ming, Han, Xuefeng, Deng, Xiumei, Wang, Liwei
Federated Learning (FL) offers a decentralized solution that enables collaborative local model training and global aggregation, thereby protecting data privacy. Conventional FL frameworks typically preserve privacy under the assumption that local data remains strictly private, while client mobility is rarely modeled explicitly. In this paper, we propose a hierarchical federated learning framework based on a social network with mobility, named HFL-SNM, that considers both data sharing among clients and their mobility patterns. Under limited-resource constraints, we formulate a joint optimization problem of resource allocation and client scheduling whose objective is to minimize the energy consumption of clients during the FL process. Within the social network, we introduce the concepts of Effective Data Coverage Rate and Redundant Data Coverage Rate, and we analyze the impact of effective and redundant data on model performance through preliminary experiments. We decouple the optimization problem into multiple sub-problems, analyze them in light of the preliminary experimental results, and propose the Dynamic Optimization in Social Network with Mobility (DO-SNM) algorithm. Experimental results demonstrate that our algorithm achieves superior model performance while significantly reducing energy consumption compared to traditional baseline algorithms.
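The abstract does not spell out how the two coverage rates are computed, so the sketch below is only one plausible set-based reading: effective coverage as the fraction of distinct samples reached by at least one scheduled client, and redundant coverage as the fraction reached by two or more. The function and variable names (`coverage_rates`, `client_data`, `universe_size`) are hypothetical, not the authors' notation.

```python
# Hedged sketch of Effective/Redundant Data Coverage Rate, assuming a
# set-based interpretation. Not the authors' published formulas.
from itertools import combinations

def coverage_rates(scheduled, client_data, universe_size):
    """scheduled: client ids chosen for this round.
    client_data: dict of client id -> set of sample ids the client can access
                 (its own data plus data shared by social neighbors).
    universe_size: total number of distinct samples across all clients.
    """
    covered = set().union(*(client_data[c] for c in scheduled)) if scheduled else set()
    # Effective coverage: share of all samples reached by at least one client.
    effective = len(covered) / universe_size
    # Redundant coverage: share of samples reached by two or more scheduled
    # clients, i.e. data trained on repeatedly without adding information.
    overlap = set()
    for a, b in combinations(scheduled, 2):
        overlap |= client_data[a] & client_data[b]
    redundant = len(overlap) / universe_size
    return effective, redundant

if __name__ == "__main__":
    data = {0: {1, 2, 3}, 1: {3, 4}, 2: {5, 6}}
    print(coverage_rates([0, 1, 2], data, universe_size=6))  # (1.0, ~0.17)
```

Under this reading, a scheduler would favor client subsets that raise effective coverage while keeping redundant coverage low, which matches the paper's stated goal of trading off model performance against wasted energy.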
Data-Efficient Training of CNNs and Transformers with Coresets: A Stability Perspective
Gupta, Animesh, Hasan, Irtiza, Prasad, Dilip K., Gupta, Deepak K.
Coreset selection is among the most effective ways to reduce the training time of CNNs; however, little is known about how the resulting models behave under variations in coreset size and the choice of datasets and models. Moreover, given the recent paradigm shift towards transformer-based models, it is still an open question how coreset selection would impact their performance. Several similar intriguing questions must be answered before coreset selection methods can gain wide acceptance, and this paper attempts to answer some of them. We present a systematic benchmarking setup and perform a rigorous comparison of different coreset selection methods on CNNs and transformers. Our investigation reveals that, under certain circumstances, random selection of subsets is more robust and stable than the SOTA selection methods. We demonstrate that the conventional concept of uniformly sampling subsets across the various classes of the data is not the appropriate choice; rather, samples should be chosen adaptively based on the complexity of the data distribution for each class. Transformers are generally pretrained on large datasets, and we show that, for certain target datasets, this pretraining helps keep their performance stable even at very small coreset sizes. We further show that when no pretraining is done, or when pretrained transformer models are used with non-natural images (e.g. medical data), CNNs tend to generalize better than transformers even at very small coreset sizes. Lastly, we demonstrate that, in the absence of the right pretraining, CNNs are better at learning the semantic coherence between spatially distant objects within an image, and they tend to outperform transformers at almost all choices of coreset size.
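To make the "adaptive per-class sampling" claim concrete, here is a minimal sketch that allocates a coreset budget in proportion to a per-class complexity proxy (within-class feature variance). Both the proxy and the `adaptive_coreset` helper are illustrative assumptions for this digest, not the paper's method.

```python
# Hedged sketch: class-adaptive coreset allocation vs. a uniform per-class
# split, using within-class feature variance as an assumed complexity proxy.
import numpy as np

def adaptive_coreset(features, labels, budget, rng=None):
    """Pick roughly `budget` sample indices, giving harder (higher-variance)
    classes a proportionally larger share of the coreset."""
    rng = rng or np.random.default_rng(0)
    classes = np.unique(labels)
    # Per-class complexity proxy: mean feature variance within the class.
    complexity = np.array([features[labels == c].var(axis=0).mean() for c in classes])
    shares = complexity / complexity.sum()
    picked = []
    for c, share in zip(classes, shares):
        idx = np.flatnonzero(labels == c)
        k = min(len(idx), max(1, int(round(share * budget))))
        picked.extend(rng.choice(idx, size=k, replace=False))
    return np.array(picked)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 0.1, (100, 8)),   # easy, tight class
                   rng.normal(0, 2.0, (100, 8))])  # hard, spread-out class
    y = np.repeat([0, 1], 100)
    core = adaptive_coreset(X, y, budget=40)
    print("class counts in coreset:", np.bincount(y[core]))
```

In this toy run the high-variance class receives most of the budget, whereas a uniform split would give each class 20 samples regardless of difficulty; rounding means the total can differ slightly from `budget`.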
The Secret Weapon Behind Quality AI: Effective Data Labeling - insideBIGDATA
In this special guest feature, Carlos Melendez, COO of Wovenware, discusses best practices for "The Third Mile in AI Development": the fast-growing market subsector of data labeling companies. The article addresses this trend and explains that data labeling is not really a commodity market, but one that admits different strategies for successful outcomes. Wovenware is a Puerto Rico-based, design-driven company that delivers customized AI and other digital transformation solutions that create measurable value for government and private business customers across the U.S. The growth of AI has spawned increasing investor interest in data labeling: in the past year, companies specializing in it have secured millions of dollars in funding and continue to find new ways to monetize this often tedious aspect of AI development. Yet this third mile of AI development is also perhaps the most crucial one for effective AI solutions.